DCU's Experiments for the NTCIR-8 IR4QA Task

Authors

  • Jinming Min
  • Jie Jiang
  • Johannes Leveling
  • Gareth J. F. Jones
  • Andy Way
Abstract

We describe DCU’s participation in the NTCIR-8 IR4QA task [16]. This task is a cross-language information retrieval (CLIR) task from English to Simplified Chinese which seeks to provide relevant documents for later cross-language question answering (CLQA) tasks. For the IR4QA task, we submitted 5 official runs, including two monolingual runs and three CLIR runs. For monolingual retrieval we tested two information retrieval models. The results show that the KL-divergence language model method performs better than the Okapi BM25 model for the Simplified Chinese retrieval task. This agrees with our previous CLIR experimental results at NTCIR-5. For the CLIR task, we compare query translation and document translation methods. In the query translation based runs, we tested a method for query expansion from an external resource (QEE) before query translation. Our result for this run is slightly lower than that of the run without QEE. Our results show that the document translation method achieves 68.24% MAP compared to our best query translation run. For the document translation method, we found that the main issue is the lack of named entity translation in the documents, since we do not have a suitable parallel corpus as training data for the statistical machine translation system. Our best CLIR run comes from the combination of query translation using Google Translate and the KL-divergence language model retrieval method; it achieves 79.94% MAP relative to our best monolingual run.
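The abstract compares the Okapi BM25 model against a KL-divergence language model for ranked retrieval. As a minimal sketch of the first of these, the following illustrates the standard BM25 scoring formula on a toy tokenized corpus; the corpus, query, and parameter values (k1=1.5, b=0.75) are illustrative assumptions, not taken from the paper's experimental setup.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` against `query` with Okapi BM25.

    `query` is a list of terms; each document is a list of tokens.
    Uses the common IDF variant log((N - df + 0.5) / (df + 0.5) + 1).
    """
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N  # average document length
    df = Counter()                          # document frequency per term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)                     # term frequency in this document
        score = 0.0
        for q in query:
            if q not in tf:
                continue
            idf = math.log((N - df[q] + 0.5) / (df[q] + 0.5) + 1)
            # BM25 term saturation with length normalization
            score += idf * tf[q] * (k1 + 1) / (
                tf[q] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(score)
    return scores

# Illustrative corpus (hypothetical, not from the paper)
docs = [
    "machine translation improves retrieval".split(),
    "language model retrieval with dirichlet smoothing".split(),
    "named entity recognition for chinese".split(),
]
scores = bm25_scores("language model retrieval".split(), docs)
# The second document matches all three query terms, so it ranks first;
# the third matches none and scores 0.
```

The KL-divergence language model approach favored in the paper instead ranks documents by the divergence between a query language model and a smoothed document language model, but the term-weighting role it plays in the ranking pipeline is analogous.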


Similar Papers

Overview of the NTCIR-7 ACLIA IR4QA Task

This paper presents an overview of the IR4QA (Information Retrieval for Question Answering) Task of the NTCIR-7 ACLIA (Advanced Cross-lingual Information Access) Task Cluster. IR4QA evaluates traditional ranked retrieval of documents using well-studied metrics such as Average Precision, but the retrieval task is embedded in the context of cross-lingual question answering. That is, document retri...


Are Popular Documents More Likely To Be Relevant? A Dive into the ACLIA IR4QA Pools

The ACLIA IR4QA Task at NTCIR-7 is an ad hoc document retrieval task involving three document languages. Although IR4QA used pooling for collecting relevance assessments, it was unique in that the pooled documents were sorted before presenting them to the assessors, based on the assumption that “popular” documents are more likely to be relevant than others. We show that this assumption is indee...


An Open-domain Question Answering System for NTCIR-8 C-C Task

In this paper, we describe our CCLQA system and the evaluation results for the C-C task at NTCIR-8 ACLIA. The system consists of a Question Analysis module, an IR module, and an Answer Extraction module. The Question Analysis module was developed for NTCIR-7 CCLQA and is based on the Question pattern library and HowNet. The IR module was developed for the NTCIR-8 IR4QA task, and the results of KECIR-C...


Statistical Machine Translation based Passage Retrieval - Experiment at NTCIR-7 IR4QA Task

In this paper, we apply statistical machine translation based passage retrieval, which was proposed at the previous NTCIR-6 CLQA subtask, to the IR4QA Task. The experimental evaluation shows that the method is more effective for relation and event type questions, which are longer and include relatively many common keywords, than for definition and biography type questions, which are short...


NTCIR-7 ACLIA IR4QA Results based on Qrels Version 2

This document is a postscript to the Overview of the NTCIR-7 ACLIA IR4QA Task [2]. At the NTCIR-7 Workshop Meeting (December 2008), participating systems of IR4QA were evaluated based on “qrels version 1,” which covered the depth-30 pool for every topic and went further down the pool for a limited number of topics. Here, we report on revised results based on “qrels version 2,” which covers the de...



Journal:

Volume   Issue

Pages  -

Publication date: 2010